1.  INTRODUCTION.

Various proofs have been given of the minimum principle satisfied by an optimal
control in a partially observed stochastic control problem. See, for example, the papers
by Bensoussan [1], Elliott [5], Haussmann [7], and the recent paper [9] by Haussmann in
which the adjoint process is identified. The simple case of a partially observed Markov
chain is discussed in the University of Maryland lecture notes [6] of the second author.

We show in this article how a minimum principle for a partially observed diffusion
can be obtained by differentiating the statement that a control $u^*$ is optimal. The results
of Bismut [2], [3] and Kunita [10] on stochastic flows enable us to compute in an easy and
explicit way the change in the cost due to a ‘strong variation’ of an optimal control. The
only technical difficulty is the justification of the differentiation. As we wished to exhibit
the simplification obtained by using the ideas of stochastic flows, the result is not proved
under the weakest possible hypotheses. Finally, in Section 6, we show how Bensoussan’s
minimum principle follows from our result if the drift coefficient is differentiable in the
control variable.


2.  DYNAMICS. 

Suppose the state of the system is described by a stochastic differential equation

$$d\xi_t = f(t, \xi_t, u)\,dt + g(t, \xi_t)\,dw_t, \qquad \xi_t \in R^d, \quad \xi_0 = x_0, \quad 0 \le t \le T. \tag{2.1}$$

The control parameter $u$ will take values in a compact subset $U$ of some Euclidean space $R^k$.
We shall make the following assumptions:

$A_1$: $x_0$ is given; if $x_0$ is a random variable and $P_0$ its distribution the situation when
$\int |x|^q P_0(dx) < \infty$ for some $q > n + 1$ can be treated, as in [9], by including an extra
integration with respect to $P_0$.

$A_2$: $f : [0,T] \times R^d \times U \to R^d$ is Borel measurable, continuous in $u$ for each $(t,x)$,
continuously differentiable in $x$ and for some constant $K_1$

$$(1 + |x|)^{-1} |f(t,x,u)| + |f_x(t,x,u)| \le K_1.$$

$A_3$: $g : [0,T] \times R^d \to R^d \otimes R^n$ is a matrix valued function, Borel measurable, continuously
differentiable in $x$, and for some constant $K_2$

$$|g(t,x)| + |g_x(t,x)| \le K_2.$$

The observation process is given by

$$dy_t = h(\xi_t)\,dt + d\nu_t, \qquad y_t \in R^m, \quad y_0 = 0, \quad 0 \le t \le T. \tag{2.2}$$

In the above equations $w = (w^1, \dots, w^n)$ and $\nu = (\nu^1, \dots, \nu^m)$ are independent Brownian
motions. We also assume

$A_4$: $h : R^d \to R^m$ is Borel measurable, continuously differentiable in $x$, and for some
constant $K_3$

$$|h(x)| + |h_x(x)| \le K_3.$$


REMARKS 2.1. These hypotheses can be weakened. For example, in $A_4$, $h$ can be
allowed linear growth in $x$. Because $g$ is bounded a delicate argument then implies the
exponential $Z$ of (2.3) is in some $L^p$ space, $1 < p < \infty$. (See, for example, Theorem 2.2 of
[8].) However, when $h$ is bounded $Z$ is in all the $L^p$ spaces (see Lemma 2.3). Also, if we
require $f$ to have linear growth in $u$ then the set of control values $U$ can be unbounded
as in [9]. Our objective, however, is not the greatest generality but to demonstrate the
simplicity of the techniques of stochastic flows.

Let $P$ denote Wiener measure on $C([0,T], R^n)$ and $\mu$ denote Wiener measure
on $C([0,T], R^m)$. Consider the space $\Omega = C([0,T], R^n) \times C([0,T], R^m)$ with coordinate
functions $(x_t, y_t)$ and define Wiener measure $P$ on $\Omega$ by

$$P(dx, dy) = P(dx)\,\mu(dy).$$

DEFINITION 2.2. Write $Y = \{Y_t\}$ for the right continuous complete filtration on
$C([0,T], R^m)$ generated by $Y_t^0 = \sigma\{y_s : s \le t\}$. The set of admissible control functions $\underline{U}$
will be the $Y$-predictable functions on $[0,T] \times C([0,T], R^m)$ with values in $U$.

For $u \in \underline{U}$ and $x \in R^d$ write $\xi^u_{s,t}(x)$ for the strong solution of (2.1) corresponding to
control $u$, and with $\xi^u_{s,s}(x) = x$. Write

$$Z^u_{s,t}(x) = \exp\Big(\int_s^t h(\xi^u_{s,r}(x))'\,dy_r - \frac{1}{2}\int_s^t |h(\xi^u_{s,r}(x))|^2\,dr\Big) \tag{2.3}$$

and define a new probability measure $P^u$ on $\Omega$ by $\frac{dP^u}{dP} = Z^u_{0,T}(x_0)$. Then under $P^u$,
$(\xi^u_{0,t}(x_0), y_t)$ is a solution of (2.1) and (2.2), that is $\xi^u_{0,t}(x_0)$ remains a strong solution of
(2.1) and there is an independent Brownian motion $\nu$ such that $y_t$ satisfies (2.2). A version
of $Z$ defined for every trajectory $y$ of the observation process is obtained by integrating by
parts the stochastic integral in (2.3).

LEMMA 2.3. Under hypothesis $A_4$, for $t \le T$,

$$E\big[(Z^u_{0,t}(x_0))^p\big] < \infty \quad \text{for all } u \in \underline{U} \text{ and all } p,\ 1 \le p < \infty.$$

PROOF.

$$Z^u_{0,t}(x_0) = 1 + \int_0^t Z^u_{0,r}(x_0)\, h(\xi^u_{0,r}(x_0))'\,dy_r.$$

Therefore, for any $p$ there is a constant $C_p$ such that

$$E\big[(Z^u_{0,t}(x_0))^p\big] \le C_p\Big[1 + E\Big(\int_0^t (Z^u_{0,r}(x_0))^2\, |h(\xi^u_{0,r}(x_0))|^2\,dr\Big)^{p/2}\Big].$$

The result follows by Gronwall’s inequality.
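
When $h$ is bounded as in $A_4$, the moment bound can also be seen directly from the exponential form (2.3). The following is a sketch of that alternative argument, not the proof given above. It assumes only that $y$ is a Brownian motion under $P$ and that $|h| \le K_3$, and it writes $h_r$ as shorthand for $h(\xi^u_{0,r}(x_0))$.

```latex
% Shorthand (introduced here): h_r = h(\xi^u_{0,r}(x_0)), so |h_r| \le K_3 by A_4.
\begin{align*}
\big(Z^u_{0,t}(x_0)\big)^p
  &= \exp\Big(p\int_0^t h_r'\,dy_r - \frac{p}{2}\int_0^t |h_r|^2\,dr\Big) \\
  &= \underbrace{\exp\Big(\int_0^t (p\,h_r)'\,dy_r
       - \frac{1}{2}\int_0^t |p\,h_r|^2\,dr\Big)}_{=:\,M_t}
     \;\exp\Big(\frac{p^2-p}{2}\int_0^t |h_r|^2\,dr\Big).
\end{align*}
% M is the stochastic exponential of a bounded integrand, so by Novikov's
% criterion it is a martingale under P with E[M_t] = 1. Hence, for p \ge 1,
\[
E\big[(Z^u_{0,t}(x_0))^p\big] \;\le\; \exp\Big(\frac{p(p-1)}{2}\,K_3^2\,t\Big) \;<\; \infty .
\]
```

The bound obtained this way does not depend on the control $u$.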

COST 2.4. We shall suppose the cost is purely terminal and given by some bounded,
differentiable function

$$c(\xi^u_{0,T}(x_0))$$

which has bounded derivatives. Then the expected cost if control $u \in \underline{U}$ is used is

$$J(u) = E_u\big[c(\xi^u_{0,T}(x_0))\big].$$

In terms of $P$, under which $y_t$ is always a Brownian motion, this is

$$J(u) = E\big[Z^u_{0,T}(x_0)\, c(\xi^u_{0,T}(x_0))\big]. \tag{2.4}$$


3.  STOCHASTIC  FLOWS. 


For $u \in \underline{U}$ write

$$\xi^u_{s,t}(x) = x + \int_s^t f(r, \xi^u_{s,r}(x), u_r)\,dr + \int_s^t g(r, \xi^u_{s,r}(x))\,dw_r \tag{3.1}$$

for the solution of (2.1) over the time interval $[s,t]$ with initial condition $\xi^u_{s,s}(x) = x$. In
the sequel we wish to discuss the behaviour of (3.1) for each trajectory $y$ of the observation
process. We have already noted there is a version of $Z$ defined for every $y$. The results of
Bismut [2] and Kunita [10] extend easily and show the map

$$\xi^u_{s,t} : R^d \to R^d$$

is, almost surely, for each $y \in C([0,T], R^m)$ a diffeomorphism. Bismut [2] initially gives
proofs when the coefficients $f$ and $g$ are bounded, but points out that a stopping time
argument extends the results to when, for example, the coefficients have linear growth.

Write $\|\xi^u(x_0)\|_t = \sup_{0 \le s \le t} |\xi^u_{0,s}(x_0)|$. Then, as in Lemma 2.1 of [8], for any $p$,
$1 \le p < \infty$, using Gronwall’s and Jensen’s inequalities

$$\|\xi^u(x_0)\|_T^p \le C\Big(1 + |x_0|^p + \sup_{0 \le t \le T}\Big|\int_0^t g(r, \xi^u_{0,r}(x_0))\,dw_r\Big|^p\Big)$$

almost surely, for some constant $C$.

Therefore, using Burkholder’s inequality and hypothesis $A_3$, $\|\xi^u(x_0)\|_T$ is in $L^p$ for
all $p$, $1 \le p < \infty$.

Suppose $u^* \in \underline{U}$ is an optimal control so $J(u^*) \le J(u)$ for any other $u \in \underline{U}$. Write
$\xi^*_{s,t}(\cdot)$ for $\xi^{u^*}_{s,t}(\cdot)$. The Jacobian $\frac{\partial \xi^*_{s,t}}{\partial x}(x)$ is the matrix solution $C_t$ of the equation, for $s \le t$,

$$dC_t = f_x(t, \xi^*_{s,t}(x), u^*)\,C_t\,dt + \sum_{i=1}^n g^{(i)}_x(t, \xi^*_{s,t}(x))\,C_t\,dw^i_t, \qquad C_s = I. \tag{3.2}$$

Here $I$ is the $d \times d$ identity matrix and $g^{(i)}$ is the $i$th column of $g$. From hypotheses $A_2$
and $A_3$, $f_x$ and $g_x$ are bounded. Writing $\|C\|_T = \sup_{s \le t \le T} |C_t|$, an application of Gronwall’s,
Jensen’s and Burkholder’s inequalities again implies $\|C\|_T$ is in $L^p$ for all $p$, $1 \le p < \infty$.
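Consider the related matrix valued stochastic differential equation

$$D_t = I - \int_s^t D_r f_x(r, \xi^*_{s,r}(x), u^*_r)\,dr - \sum_{i=1}^n \int_s^t D_r\, g^{(i)}_x(r, \xi^*_{s,r}(x))\,dw^i_r + \sum_{i=1}^n \int_s^t D_r \big(g^{(i)}_x(r, \xi^*_{s,r}(x))\big)^2\,dr. \tag{3.3}$$

Then it can be checked that $D_t C_t = I$ for $t \ge s$, so that $D_t$ is the inverse of the Jacobian,
that is $D_t = \big(\frac{\partial \xi^*_{s,t}(x)}{\partial x}\big)^{-1}$. Again, because $f_x$ and $g_x$ are bounded we have that $\|D\|_T$ is in
every $L^p$, $1 \le p < \infty$.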
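
The identity $D_t C_t = I$ can be checked with the Itô product rule. The sketch below uses the shorthand $F_t = f_x(t, \xi^*_{s,t}(x), u^*_t)$ and $G^i_t = g^{(i)}_x(t, \xi^*_{s,t}(x))$, which is introduced here only for the calculation and is not notation from the text.

```latex
% Shorthand (assumed for this sketch): F_t = f_x(t,\xi^*_{s,t}(x),u^*_t),
% G^i_t = g^{(i)}_x(t,\xi^*_{s,t}(x)). In differential form (3.2) and (3.3) read
\begin{align*}
dC_t &= F_t C_t\,dt + \sum_{i=1}^n G^i_t C_t\,dw^i_t, \\
dD_t &= -D_t F_t\,dt - \sum_{i=1}^n D_t G^i_t\,dw^i_t + \sum_{i=1}^n D_t (G^i_t)^2\,dt .
\end{align*}
% Ito product rule: d(DC) = (dD)C + D(dC) + d<D,C>, with covariation term
% d<D,C>_t = -\sum_i D_t G^i_t G^i_t C_t\,dt.
\begin{align*}
d(D_t C_t)
 &= \Big(-D_t F_t C_t + \sum_i D_t (G^i_t)^2 C_t\Big)dt - \sum_i D_t G^i_t C_t\,dw^i_t \\
 &\quad + D_t F_t C_t\,dt + \sum_i D_t G^i_t C_t\,dw^i_t
        - \sum_i D_t (G^i_t)^2 C_t\,dt \\
 &= 0 .
\end{align*}
% Since D_s C_s = I, it follows that D_t C_t = I for all t \ge s.
```

The second order drift term in (3.3) is exactly what cancels the covariation between $D$ and $C$.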

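For a $d$-dimensional semimartingale $z_t$ Bismut [2] shows one can consider the flow
$\xi^*_{s,t}(z_t)$ and gives the semimartingale representation of this process. In fact if
$z_t = z_s + A_t + \sum_{i=1}^n \int_s^t H_i\,dw^i_r$ is the $d$-dimensional semimartingale, Bismut’s formula states that

$$\xi^*_{s,t}(z_t) = z_s + \int_s^t \Big( f(r, \xi^*_{s,r}(z_r), u^*_r) + \sum_{i=1}^n g^{(i)}_\xi(r, \xi^*_{s,r}(z_r))\, \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\, H_i + \frac{1}{2} \sum_{i=1}^n \frac{\partial^2 \xi^*_{s,r}}{\partial x^2}(z_r)(H_i, H_i) \Big)\,dr$$
$$\qquad + \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,dA_r + \sum_{i=1}^n \int_s^t \Big( g^{(i)}(r, \xi^*_{s,r}(z_r)) + \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\, H_i \Big)\,dw^i_r. \tag{3.4}$$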

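DEFINITION 3.1. We shall consider perturbations of the optimal control $u^*$ of the following
kind: For $s \in [0,T)$, $h > 0$ such that $0 \le s < s + h \le T$, for any other admissible
control $\tilde{u} \in \underline{U}$ and $A \in Y_s$, define a strong variation of $u^*$ by

$$u(t,\omega) = \begin{cases} u^*(t,\omega) & \text{if } (t,\omega) \notin [s, s+h] \times A \\ \tilde{u}(t,\omega) & \text{if } (t,\omega) \in [s, s+h] \times A. \end{cases}$$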

Applying (3.4) as in Theorem 5.1 of [4] we have the following result.

THEOREM 3.2. For the perturbation $u$ of the optimal control $u^*$ consider the process

$$z_t = x + \int_s^t \Big(\frac{\partial \xi^*_{s,r}(z_r)}{\partial x}\Big)^{-1} \big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r)\big)\,dr. \tag{3.5}$$

Then the process $\xi^*_{s,t}(z_t)$ is indistinguishable from $\xi^u_{s,t}(x)$.


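PROOF. Note the equation defining $z_t$ involves only an integral in time; there is no
martingale term, so to apply (3.4) we have $H_i = 0$ for all $i$. Therefore, from (3.4)

$$\xi^*_{s,t}(z_t) = x + \int_s^t f(r, \xi^*_{s,r}(z_r), u^*_r)\,dr$$
$$\qquad + \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r) \Big(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Big)^{-1} \big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r)\big)\,dr$$
$$\qquad + \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r.$$

The Jacobian and its inverse cancel, so $\xi^*_{s,t}(z_t)$ satisfies equation (3.1) with control $u$.
However, the solution of (3.1) is unique so

$$\xi^*_{s,t}(z_t) = \xi^u_{s,t}(x).$$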
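
A scalar linear example may make Theorem 3.2 concrete. It is an illustration only: the coefficients $f(t,\xi,u) = a\xi + u$ and $g \equiv \sigma$ (with $d = n = 1$ and constants $a$, $\sigma$) are assumptions chosen so that the flow is explicit, and $z_t$ is taken with the inverse Jacobian $(\partial \xi^*_{s,r}/\partial x)^{-1}$ multiplying the difference of drifts.

```latex
% Assumed example: d = n = 1, f(t,\xi,u) = a\xi + u, g \equiv \sigma.
% Flow under the optimal control and its Jacobian:
\[
\xi^*_{s,t}(x) = e^{a(t-s)}x + \int_s^t e^{a(t-r)}u^*_r\,dr
               + \sigma\int_s^t e^{a(t-r)}\,dw_r ,
\qquad \frac{\partial \xi^*_{s,t}}{\partial x} = e^{a(t-s)} .
\]
% The difference of drifts is u_r - u^*_r, so the process z becomes
\[
z_t = x + \int_s^t e^{-a(r-s)}\,(u_r - u^*_r)\,dr .
\]
% Substituting z_t for the initial point and using e^{a(t-s)}e^{-a(r-s)} = e^{a(t-r)}:
\begin{align*}
\xi^*_{s,t}(z_t)
 &= e^{a(t-s)}x + \int_s^t e^{a(t-r)}(u_r - u^*_r)\,dr
    + \int_s^t e^{a(t-r)}u^*_r\,dr + \sigma\int_s^t e^{a(t-r)}\,dw_r \\
 &= e^{a(t-s)}x + \int_s^t e^{a(t-r)}u_r\,dr + \sigma\int_s^t e^{a(t-r)}\,dw_r
  \;=\; \xi^u_{s,t}(x) .
\end{align*}
```

In this example the perturbed trajectory is recovered by moving the initial point along $z_t$, which is the variation of constants formula for the linear equation.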

REMARKS 3.3. Note that the perturbation $u(t)$ equals $u^*(t)$ if $t > s + h$ so $z_t = z_{s+h}$
if $t > s + h$ and

$$\xi^*_{s,t}(z_t) = \xi^*_{s,t}(z_{s+h}) = \xi^*_{s+h,t}\big(\xi^u_{s,s+h}(x)\big).$$


4.  AUGMENTED  FLOWS. 

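Consider the augmented flow which includes as an extra coordinate the stochastic
exponential $Z^*_{s,t}$ with a ‘variable’ initial condition $z \in R$ for $Z^*_{s,s}(\cdot)$. That is, consider the
$(d + 1)$ dimensional system given by:

$$\xi^*_{s,t}(x) = x + \int_s^t f(r, \xi^*_{s,r}(x), u^*_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(x))\,dw_r$$
$$Z^*_{s,t}(x,z) = z + \int_s^t Z^*_{s,r}(x,z)\, h(\xi^*_{s,r}(x))'\,dy_r.$$

Therefore,

$$Z^*_{s,t}(x,z) = z Z^*_{s,t}(x) = z \exp\Big(\int_s^t h(\xi^*_{s,r}(x))'\,dy_r - \frac{1}{2}\int_s^t |h(\xi^*_{s,r}(x))|^2\,dr\Big)$$

and we see there is a version of the enlarged system defined for each trajectory $y$ by
integrating by parts the stochastic integral. The augmented map $(x,z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x,z))$
is then almost surely a diffeomorphism of $R^{d+1}$. Note that $\frac{\partial f}{\partial z} = 0$, $\frac{\partial g}{\partial z} = 0$ and
$\frac{\partial \xi^*_{s,t}}{\partial z} = 0$. The Jacobian of this augmented map is, therefore, represented by the matrix

$$\begin{pmatrix} \dfrac{\partial \xi^*_{s,t}}{\partial x} & 0 \\[2mm] \dfrac{\partial Z^*_{s,t}}{\partial x} & \dfrac{\partial Z^*_{s,t}}{\partial z} \end{pmatrix}$$

and for $1 \le i \le d$, from equation (3.2)

$$\frac{\partial Z^*_{s,t}(x,z)}{\partial x_i} = \int_s^t \Big( Z^*_{s,r}(x,z)\, \frac{\partial h}{\partial \xi_k}(\xi^*_{s,r}(x))\, \frac{\partial \xi^{*k}_{s,r}(x)}{\partial x_i} + h(\xi^*_{s,r}(x))\, \frac{\partial Z^*_{s,r}(x,z)}{\partial x_i} \Big)'\,dy_r. \tag{4.1}$$

(Here the double index $k$ is summed from 1 to $d$.)

We shall be interested in the solution of this differential system (4.1) only in the
situation when $z = 1$ so we shall write $Z^*_{s,t}(x)$ for $Z^*_{s,t}(x, 1)$. The following result is
motivated by formally differentiating the exponential formula for $Z^*_{s,t}(x)$.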


LEMMA 4.1.

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = Z^*_{s,t}(x) \Big( \int_s^t h_\xi(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Big)$$

where $\nu = (\nu^1, \dots, \nu^m)$ is the Brownian motion in the observation process.
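
The formal differentiation that motivates the lemma can be written out as follows. This is a heuristic sketch only: it differentiates under the integral signs without justification, uses the loose vector notation of the lemma, and relies on $d\nu_r = dy_r - h(\xi^*_{s,r}(x))\,dr$ from (2.2).

```latex
\begin{align*}
Z^*_{s,t}(x)
 &= \exp\Big(\int_s^t h(\xi^*_{s,r}(x))'\,dy_r
      - \frac12\int_s^t |h(\xi^*_{s,r}(x))|^2\,dr\Big), \\
\frac{\partial Z^*_{s,t}(x)}{\partial x}
 &= Z^*_{s,t}(x)\Big(\int_s^t h_\xi(\xi^*_{s,r}(x))\,
      \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,dy_r
      - \int_s^t h_\xi(\xi^*_{s,r}(x))\,
      \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,h(\xi^*_{s,r}(x))\,dr\Big) \\
 &= Z^*_{s,t}(x)\int_s^t h_\xi(\xi^*_{s,r}(x))\,
      \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,
      \big(dy_r - h(\xi^*_{s,r}(x))\,dr\big) \\
 &= Z^*_{s,t}(x)\int_s^t h_\xi(\xi^*_{s,r}(x))\,
      \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r .
\end{align*}
```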

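PROOF. From (4.1) we see $\frac{\partial Z^*_{s,t}(x)}{\partial x}$ is the solution of the stochastic differential equation

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = \int_s^t \Big( \frac{\partial Z^*_{s,r}(x)}{\partial x}\, h'(\xi^*_{s,r}(x)) + Z^*_{s,r}(x)\, h_\xi(\xi^*_{s,r}(x))\, \frac{\partial \xi^*_{s,r}(x)}{\partial x} \Big)\,dy_r. \tag{4.2}$$

Write

$$L_{s,t}(x) = Z^*_{s,t}(x) \Big( \int_s^t h_\xi(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Big)$$

where

$$dy_t = h(\xi^*_{s,t}(x))\,dt + d\nu_t.$$

Because

$$Z^*_{s,t}(x) = 1 + \int_s^t Z^*_{s,r}(x)\, h'(\xi^*_{s,r}(x))\,dy_r$$

the product rule gives

$$L_{s,t}(x) = \int_s^t Z^*_{s,r}(x)\, h_\xi \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r + \int_s^t \Big( \int_s^r h_\xi \cdot \frac{\partial \xi^*_{s,\sigma}(x)}{\partial x}\,d\nu_\sigma \Big) Z^*_{s,r}(x)\, h'(\xi^*_{s,r}(x))\,dy_r$$
$$\qquad + \int_s^t Z^*_{s,r}(x)\, h_\xi \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\, h(\xi^*_{s,r}(x))\,dr$$
$$= \int_s^t L_{s,r}(x)\, h'(\xi^*_{s,r}(x))\,dy_r + \int_s^t Z^*_{s,r}(x)\, h_\xi \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,dy_r.$$

Therefore, $L_{s,t}(x)$ is also a solution of (4.2) so by uniqueness

$$L_{s,t}(x) = \frac{\partial Z^*_{s,t}(x)}{\partial x}.$$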

REMARKS 4.2. As noted at the beginning of this section we can consider the
augmented flow

$$(x,z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x,z)) \qquad \text{for } x \in R^d,\ z \in R,$$

and we are only interested in the situation when $z = 1$, so we write $Z^*_{s,t}(x)$.


LEMMA 4.3. $Z^*_{s,t}(z_t) = Z^u_{s,t}(x)$ where $z_t$ is the semimartingale defined in (3.5).

PROOF. $Z^u_{s,t}(x)$ is the process uniquely defined by

$$Z^u_{s,t}(x) = 1 + \int_s^t Z^u_{s,r}(x)\, h'(\xi^u_{s,r}(x))\,dy_r. \tag{4.3}$$

Consider an augmented $(d + 1)$ dimensional version of (3.5) defining a semimartingale
$\bar{z}_t = (z_t, 1)$, so the additional component is always identically 1. Then applying (3.4) to
the new component of the augmented process we have

$$Z^*_{s,t}(z_t) = 1 + \int_s^t Z^*_{s,r}(z_r)\, h'(\xi^*_{s,r}(z_r))\,dy_r = 1 + \int_s^t Z^*_{s,r}(z_r)\, h'(\xi^u_{s,r}(x))\,dy_r$$

by Theorem 3.2. However, (4.3) has a unique solution so $Z^*_{s,t}(z_t) = Z^u_{s,t}(x)$.

REMARKS 4.4. Note that for $t > s + h$

$$Z^*_{s,t}(z_t) = Z^*_{s,t}(z_{s+h}).$$


5.  THE  MINIMUM  PRINCIPLE. 


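Control $u$ will be the perturbation of the optimal control $u^*$ as in Definition 3.1. We
shall write $x = \xi^*_{0,s}(x_0)$. Then the minimum cost is

$$J(u^*) = E\big[Z^*_{0,T}(x_0)\, c(\xi^*_{0,T}(x_0))\big] = E\big[Z^*_{0,s}(x_0)\, Z^*_{s,T}(x)\, c(\xi^*_{s,T}(x))\big].$$

The cost corresponding to the perturbed control $u$ is

$$J(u) = E\big[Z^*_{0,s}(x_0)\, Z^u_{s,T}(x)\, c(\xi^u_{s,T}(x))\big] = E\big[Z^*_{0,s}(x_0)\, Z^*_{s,T}(z_{s+h})\, c(\xi^*_{s,T}(z_{s+h}))\big]$$

by Theorem 3.2 and Lemma 4.3. Now $Z^*_{s,T}(\cdot)$ and $c(\xi^*_{s,T}(\cdot))$ are differentiable with
continuous and uniformly integrable derivatives. Therefore

$$J(u) - J(u^*) = E\big[Z^*_{0,s}(x_0)\big(Z^*_{s,T}(z_{s+h})\, c(\xi^*_{s,T}(z_{s+h})) - Z^*_{s,T}(x)\, c(\xi^*_{s,T}(x))\big)\big]$$
$$= E\Big[\int_s^{s+h} \Gamma(s, z_r)\big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r)\big)\,dr\Big]$$

where

$$\Gamma(s, z_r) = Z^*_{0,s}(x_0)\, Z^*_{s,T}(z_r) \Big\{ c_\xi(\xi^*_{s,T}(z_r))\, \frac{\partial \xi^*_{s,T}(z_r)}{\partial x} + c(\xi^*_{s,T}(z_r)) \Big( \int_s^T h_\xi(\xi^*_{s,\sigma}(z_r))\, \frac{\partial \xi^*_{s,\sigma}(z_r)}{\partial x}\,d\nu_\sigma \Big) \Big\} \Big( \frac{\partial \xi^*_{s,r}(z_r)}{\partial x} \Big)^{-1}.$$

Note that this expression gives an explicit formula for the change in the cost resulting from
a variation in the optimal control. The only remaining problem is to justify differentiating
the right hand side.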

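From Lemma 2.3, $Z$ is in every $L^p$ space, $1 \le p < \infty$, and from the remarks at the
beginning of Section 3, $C_T = \frac{\partial \xi^*_{s,T}}{\partial x}$ and $D_t = \big(\frac{\partial \xi^*_{s,t}}{\partial x}\big)^{-1}$ are in every $L^p$ space, $1 \le p < \infty$.
Consequently, $\Gamma$ is in every $L^p$ space, $1 \le p < \infty$.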


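Therefore

$$J(u) - J(u^*) = \int_s^{s+h} E\big[(\Gamma(s,z_r) - \Gamma(s,x))\big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r)\big)\big]\,dr$$
$$\quad + \int_s^{s+h} E\big[(\Gamma(s,x) - \Gamma(r,x))\big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r)\big)\big]\,dr$$
$$\quad + \int_s^{s+h} E\big[\Gamma(r,x)\big(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) - f(r, \xi^*_{s,r}(x), u_r) + f(r, \xi^*_{s,r}(x), u^*_r)\big)\big]\,dr$$
$$\quad + \int_s^{s+h} E\big[\Gamma(r,x)\big(f(r, \xi^*_{0,r}(x_0), u_r) - f(r, \xi^*_{0,r}(x_0), u^*_r)\big)\big]\,dr$$
$$= I_1(h) + I_2(h) + I_3(h) + I_4(h), \text{ say.}$$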


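Now,

$$|I_1(h)| \le K_1 \int_s^{s+h} E\big[|\Gamma(s,z_r) - \Gamma(s,x)|\,(1 + \|\xi^u(x_0)\|_{s+h})\big]\,dr \le K_1 h \sup_{s \le r \le s+h} E\big[|\Gamma(s,z_r) - \Gamma(s,x)|\,(1 + \|\xi^u(x_0)\|_{s+h})\big]$$

$$|I_2(h)| \le K_2 \int_s^{s+h} E\big[|\Gamma(s,x) - \Gamma(r,x)|\,(1 + \|\xi^u(x_0)\|_{s+h})\big]\,dr \le K_2 h \sup_{s \le r \le s+h} E\big[|\Gamma(s,x) - \Gamma(r,x)|\,(1 + \|\xi^u(x_0)\|_{s+h})\big]$$

$$|I_3(h)| \le K_3 \int_s^{s+h} E\big[|\Gamma(r,x)|\, \|x - z\|_r\big]\,dr \le K_3 h \sup_{s \le r \le s+h} E\big[|\Gamma(r,x)|\, \|x - z\|_{s+h}\big].$$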


The differences $|\Gamma(s,z_r) - \Gamma(s,x)|$, $|\Gamma(s,x) - \Gamma(r,x)|$ and $\|x - z\|_{s+h}$ are all uniformly
bounded in some $L^p$, $p > 1$, and

$$\lim_{r \to s} |\Gamma(s,z_r) - \Gamma(s,x)| = 0 \quad \text{a.s.}$$
$$\lim_{r \to s} |\Gamma(s,x) - \Gamma(r,x)| = 0 \quad \text{a.s.}$$
$$\lim_{h \to 0} \|x - z\|_{s+h} = 0.$$

Therefore,

$$\lim_{r \to s} \|\Gamma(s,z_r) - \Gamma(s,x)\|_p = 0$$
$$\lim_{r \to s} \|\Gamma(s,x) - \Gamma(r,x)\|_p = 0$$

and $\lim_{h \to 0} \big\|\, \|x - z\|_{s+h} \big\|_p = 0$ for some $p$.

Consequently, $\lim_{h \to 0} h^{-1} I_k(h) = 0$, for $k = 1,2,3$.

The only remaining problem concerns the differentiability of

$$I_4(h) = \int_s^{s+h} E\big[\Gamma(r,x)\big(f(r, \xi^*_{0,r}(x_0), u_r) - f(r, \xi^*_{0,r}(x_0), u^*_r)\big)\big]\,dr.$$

The integrand is almost surely in $L^1([0,T])$ so $\lim_{h \to 0} h^{-1} I_4(h)$ exists for almost every $s \in [0,T]$.
However, the set of times $\{s\}$ where the limit may not exist might depend on the
control $u$. Consequently we must restrict the perturbations $u$ of the optimal control $u^*$ to
perturbations from a countable dense set of controls. In fact:

1)  Because the trajectories are, almost surely, continuous, $Y_\rho$ is countably generated
by sets $\{A_{i\rho}\}$, $i = 1,2,\dots$ for any rational number $\rho \in [0,T]$. Consequently $Y_t$ is
countably generated by the sets $\{A_{i\rho}\}$, $\rho \le t$.

2)  Let $G_t$ denote the set of measurable functions from $(\Omega, Y_t)$ to $U \subset R^k$. (If $u \in \underline{U}$
then $u(t,\omega) \in G_t$.) Using the $L^1$-norm, as in [5], there is a countable dense subset
$H_\rho = \{u_{j\rho}\}$ of $G_\rho$, for rational $\rho \in [0,T]$. If $H_t = \bigcup_{\rho \le t} H_\rho$ then $H_t$ is a countable
dense subset of $G_t$. If $u_{j\rho} \in H_\rho$ then, as a function constant in time, $u_{j\rho}$ can be
considered as an admissible control over the time interval $[t,T]$ for $t \ge \rho$.

3)  The countable family of perturbations is obtained by considering sets $A_{i\rho} \in Y_t$,
functions $u_{j\rho} \in H_t$, where $\rho \le t$, and defining as in 3.1

$$u^*_{j\rho}(s,\omega) = \begin{cases} u^*(s,\omega) & \text{if } (s,\omega) \notin [t,T] \times A_{i\rho} \\ u_{j\rho}(s,\omega) & \text{if } (s,\omega) \in [t,T] \times A_{i\rho}. \end{cases}$$


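Then for each $i, j, \rho$

$$\lim_{h \to 0} h^{-1} \int_s^{s+h} E\big[\Gamma(r,x)\big(f(r, \xi^*_{0,r}(x_0), u_{j\rho}) - f(r, \xi^*_{0,r}(x_0), u^*)\big) I_{A_{i\rho}}\big]\,dr \tag{5.1}$$

exists and equals

$$E\big[\Gamma(s,x)\big(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*)\big) I_{A_{i\rho}}\big]$$

for almost all $s \in [0,T]$.

Therefore, considering this perturbation we have

$$\lim_{h \to 0} h^{-1}\big(J(u^*_{j\rho}) - J(u^*)\big) = E\big[\Gamma(s,x)\big(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*)\big) I_{A_{i\rho}}\big] \ge 0 \quad \text{for almost all } s \in [0,T].$$

Consequently there is a set $S \subset [0,T]$ of zero Lebesgue measure such that, if $s \notin S$, the
limit in (5.1) exists for all $i, j, \rho$ and gives

$$E\big[\Gamma(s,x)\big(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*)\big) I_{A_{i\rho}}\big] \ge 0.$$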


Using the monotone class theorem, and approximating an arbitrary admissible control
$u \in \underline{U}$ we can deduce that if $s \notin S$

$$E\big[\Gamma(s,x)\big(f(s, \xi^*_{0,s}(x_0), u) - f(s, \xi^*_{0,s}(x_0), u^*)\big) I_A\big] \ge 0 \quad \text{for any } A \in Y_s. \tag{5.2}$$


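Write

$$p_s(x) = E^*\Big[ c_\xi(\xi^*_{0,T}(x_0))\, \frac{\partial \xi^*_{s,T}(x)}{\partial x} + c(\xi^*_{0,T}(x_0)) \Big( \int_s^T h_\xi(\xi^*_{0,r}(x_0))\, \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Big) \,\Big|\, Y_s \vee \{x\} \Big]$$

where, as before, $x = \xi^*_{0,s}(x_0)$ and $E^*$ denotes expectation under $P^* = P^{u^*}$. Then $p_s(x)$
is the co-state variable and we have in (5.2) proved the following ‘conditional’ minimum
principle: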


THEOREM 5.1. If $u^* \in \underline{U}$ is an optimal control there is a set $S \subset [0,T]$ of zero Lebesgue
measure such that if $s \notin S$

$$E^*\big[p_s(x) f(s,x,u^*) \mid Y_s\big] \le E^*\big[p_s(x) f(s,x,u) \mid Y_s\big] \quad \text{a.s.}$$

That is, the optimal control $u^*$ almost surely minimizes the conditional Hamiltonian and
the adjoint variable is $p_s(x)$.
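
The passage from (5.2), which is an expectation under $P$, to the conditional statement under $P^*$ can be sketched as follows. The symbols $\Lambda$ and $\Delta f$ are shorthand introduced here, and the sketch assumes the multiplicative property $Z^*_{0,s}(x_0)\,Z^*_{s,T}(x) = Z^*_{0,T}(x_0)$ together with the fact that $u_s$ and $u^*_s$ are $Y_s$-measurable.

```latex
% Shorthand (introduced here):
%   \Delta f = f(s,x,u_s) - f(s,x,u^*_s),
%   \Lambda  = c_\xi(\xi^*_{0,T}(x_0)) \frac{\partial \xi^*_{s,T}(x)}{\partial x}
%            + c(\xi^*_{0,T}(x_0)) \int_s^T h_\xi(\xi^*_{0,r}(x_0))
%              \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r .
% At r = s the inverse Jacobian in \Gamma is the identity, so
% \Gamma(s,x) = Z^*_{0,T}(x_0)\,\Lambda, and p_s(x) = E^*[\Lambda \mid Y_s \vee \{x\}].
\begin{align*}
0 \;\le\; E\big[\Gamma(s,x)\,\Delta f\,I_A\big]
 &= E\big[Z^*_{0,T}(x_0)\,\Lambda\,\Delta f\,I_A\big] \\
 &= E^*\big[\Lambda\,\Delta f\,I_A\big]
    && \text{since } dP^*/dP = Z^*_{0,T}(x_0) \\
 &= E^*\big[E^*[\Lambda \mid Y_s \vee \{x\}]\,\Delta f\,I_A\big]
    && \text{since } \Delta f\,I_A \text{ is } Y_s \vee \{x\}\text{-measurable} \\
 &= E^*\big[p_s(x)\,\Delta f\,I_A\big].
\end{align*}
% This holds for every A \in Y_s, hence E^*[p_s(x)\,\Delta f \mid Y_s] \ge 0 a.s.,
% which is the inequality of Theorem 5.1.
```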
6.  CONCLUSION.

Using the theory of stochastic flows the effect of a perturbation of an optimal control
is explicitly calculated. The only difficulty was to justify its differentiation. The adjoint
process is explicitly identified as $p_s(x)$.

THEOREM 6.1. If $f$ is differentiable in the control variable $u$, and if the random variable
$x = \xi^*_{0,s}(x_0)$ has a conditional density $q_s(x)$ under the measure $P^*$, then the inequality of
Theorem 5.1 implies

$$\sum_{i=1}^k (u_i(s) - u^*_i(s)) \int_{R^d} \Gamma(s,x)\, \frac{\partial f}{\partial u_i}(s,x,u^*)\, q_s(x)\,dx \ge 0.$$


This  is  the  result  of  Bensoussan’s  paper  [1]. 

The method of this paper can be applied to completely observable systems by initially
considering ‘stochastic open loop’ controls, systems with stochastic constraints and
deterministic systems. The adjoint process can be explicitly identified. ‘Almost minimum’
principles for ‘almost optimal’ controls can be obtained. Some of these will be discussed
in later work.

